This section sets out the approach for establishing whether a monad has good survey coverage or whether further survey may be required. A survey day is defined as a day on which over 40 species have been recorded within a monad.
A summary of the survey-day data is shown below. With the exception of a few well-recorded outliers, most monads have low survey-day coverage: 95% of monads have had 5 or fewer survey days, and at least 25% have had none.
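As a rough illustration of this definition, the sketch below counts survey days per monad from a table of individual species records. The data frame `records`, its columns (`monad`, `date`, `species`), the grid references and the helper `count_survey_days` are all hypothetical and are not taken from the project's data; only the threshold of more than 40 species comes from the definition above.

```r
# Hypothetical records table: one row per species record.
# Column names, grid references and values are illustrative assumptions.
records <- data.frame(
  monad   = c(rep("TQ3080", 55), rep("SU1234", 41), rep("SK5500", 12)),
  date    = as.Date(c(rep("2020-05-01", 45), rep("2020-05-02", 10),
                      rep("2020-06-10", 41), rep("2020-05-01", 12))),
  species = c(paste0("sp", 1:45), paste0("sp", 1:10),
              paste0("sp", 1:41), paste0("sp", 1:12))
)

count_survey_days <- function(records, threshold = 40) {
  # Drop repeat records of the same species on the same day in a monad.
  daily <- unique(records[, c("monad", "date", "species")])
  # Cross-tabulate: number of distinct species per monad (rows) per day (columns).
  species_per_day <- table(daily$monad, as.character(daily$date))
  # A survey day is a day with more than `threshold` species recorded.
  rowSums(species_per_day > threshold)
}

survey_days <- count_survey_days(records)
print(survey_days)
```

Monads that have records but never exceed the threshold are returned with a count of 0. Monads with no records at all would need to be joined in separately.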
## Eng_surveyEffort$SurveyDays1km_40
##        n  missing distinct     Info     Mean      Gmd      .05      .10
##   133416        0       41    0.888    1.291    1.726        0        0
##      .25      .50      .75      .90      .95
##        0        1        2        3        5
##
## lowest :   0   1   2   3   4, highest:  40  43  47 101 126
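The coverage figures quoted from this summary are quantiles and simple proportions. A minimal sketch of how they might be derived in base R is given below. The vector `survey_days` holds made-up values for illustration only, not the survey data.

```r
# Hypothetical survey-day counts for 12 monads (illustrative values only).
survey_days <- c(0, 0, 0, 1, 1, 1, 2, 2, 3, 5, 40, 126)

# Quantiles at the same probabilities reported in the summary.
quantile(survey_days, probs = c(0.05, 0.10, 0.25, 0.50, 0.75, 0.90, 0.95))

# Proportion of monads with no survey days, and with 5 or fewer.
prop_zero      <- mean(survey_days == 0)
prop_five_less <- mean(survey_days <= 5)
print(c(zero = prop_zero, five_or_fewer = prop_five_less))
```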
The summary shows that at least half of monads have one survey day or fewer, while the most heavily surveyed monad has 126 survey days. Comparing regions, some areas have been sampled better than others. However, this will also depend on the size of the defined region and on the distance of monads from urban areas, with monads closer to urban populations showing greater survey coverage. This could explain why monads in London have a comparatively higher mean number of survey days than the rest of the country.
The total number of taxa recorded for each monad was extracted and compared to the number of survey days to establish how many survey days demonstrated good coverage of a monad.
These plots demonstrate a strong relationship between the total number of taxa and the survey days per monad. The results below show their correlation and relationship when modelled with a linear regression.
##
## Pearson's product-moment correlation
##
## data: monad_survey$SurveyDays1km_40 and monad_survey$total_freq
## t = 407.76, df = 120529, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.7590239 0.7637691
## sample estimates:
## cor
## 0.7614067
##
## Call:
## lm(formula = SurveyDays1km_40 ~ total_freq, data = monad_survey)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -10.123  -0.521   0.064   0.416 102.670
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.690e-01  6.060e-03   -77.4   <2e-16 ***
## total_freq   1.759e-02  4.314e-05   407.8   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.347 on 120529 degrees of freedom
## (12885 observations deleted due to missingness)
## Multiple R-squared: 0.5797, Adjusted R-squared: 0.5797
## F-statistic: 1.663e+05 on 1 and 120529 DF, p-value: < 2.2e-16
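For readers who want to reproduce this kind of output, the sketch below runs the same two calls that appear in the results (`cor.test` and `lm` with the formula `SurveyDays1km_40 ~ total_freq`) on simulated data. The object and column names are taken from the output above. The simulated values, the seed and the noise level are assumptions chosen only so that the example runs, so its numbers will not match the report.

```r
set.seed(1)  # arbitrary seed, for reproducibility of the illustration only

# Simulated stand-in for monad_survey (values are NOT the survey data).
n <- 500
total_freq <- sample(1:700, n, replace = TRUE)
SurveyDays1km_40 <- pmax(0, round(-0.5 + 0.018 * total_freq + rnorm(n, sd = 1)))
monad_survey <- data.frame(SurveyDays1km_40, total_freq)

# Pearson correlation between survey days and total taxa.
ct <- cor.test(monad_survey$SurveyDays1km_40, monad_survey$total_freq)

# Linear regression of survey days on total taxa, as in the Call above.
fit <- lm(SurveyDays1km_40 ~ total_freq, data = monad_survey)

print(ct$estimate)
print(coef(summary(fit)))
```

Note that the model treats survey days as the response and total taxa as the predictor, so the slope is in survey days per additional taxon.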
Reading values off the fitted line, this relationship suggests that around 200 species are recorded after 3 survey days, 300 after roughly 5 survey days and 500 after roughly 8 survey days.
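These figures can be checked directly from the fitted coefficients. The short sketch below plugs the intercept and slope reported in the regression summary into the linear equation. The variable names are illustrative.

```r
# Coefficients copied from the regression summary above.
intercept <- -0.4690   # (Intercept)
slope     <- 0.01759   # total_freq

# Predicted survey days for monads with 200, 300 and 500 recorded taxa.
taxa <- c(200, 300, 500)
predicted_days <- intercept + slope * taxa
print(round(predicted_days, 1))
# [1] 3.0 4.8 8.3
```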
Plotting the total number of taxa against survey days by region shows that all regions display a levelling off. The plotted range is restricted to monads with 30 survey days or fewer, to make the regions easier to compare and to remove outliers. After approximately 5 survey days, where the total number of taxa exceeds about 250, the number of taxa found increases more slowly and tends to level off. There is no large variation between regions, which have broadly similar numbers of taxa overall, averaging 113 ± 47 species.
## # A tibble: 9 x 4
##   RGN20NM                  meanFreq minFreq maxFreq
##   <chr>                       <dbl>   <int>   <int>
## 1 East Midlands               110.        1     546
## 2 East of England              99.0       1     710
## 3 London                      222.        2     724
## 4 North East                   63.0       1     564
## 5 North West                   89.6       1     613
## 6 South East                  124.        1    1353
## 7 South West                  140.        1     638
## 8 West Midlands                85.2       1     554
## 9 Yorkshire and The Humber     82.2       1     595
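The quoted average of 113 ± 47 species appears consistent with the mean and standard deviation of the nine regional means in this table, although the report does not state how it was calculated. The sketch below re-enters the `meanFreq` column by hand and checks that reading. It is an assumption about the calculation, not a confirmed description of it.

```r
# Regional mean taxa counts, copied from the meanFreq column above.
region_means <- c(
  "East Midlands"            = 110,
  "East of England"          = 99.0,
  "London"                   = 222,
  "North East"               = 63.0,
  "North West"               = 89.6,
  "South East"               = 124,
  "South West"               = 140,
  "West Midlands"            = 85.2,
  "Yorkshire and The Humber" = 82.2
)

# Mean and sample standard deviation across the nine regions.
overall_mean <- mean(region_means)
overall_sd   <- sd(region_means)
print(round(c(mean = overall_mean, sd = overall_sd)))

# Region with the highest mean, for comparison with the text on London.
print(names(which.max(region_means)))
```

Because the table values are already rounded, the result can only be expected to agree with the quoted figure to the nearest whole species.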